Search for: All records

Creators/Authors contains: "Fox, Geoffrey"

« Prev Next »

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Surrogate modeling of Cellular-Potts agent-based models as a segmentation task using the U-Net neural network architecture

https://doi.org/10.1371/journal.pcbi.1013626

Comlekoglu, Tien; Toledo-Marín, J Quetzalcóatl; Comlekoglu, Tina; DeSimone, Douglas W; Peirce, Shayn M; Fox, Geoffrey; Glazier, James A (November 2025, PLOS Computational Biology)
Maini, Philip K (Ed.)
The Cellular-Potts model is a powerful and ubiquitous framework for developing computational models for simulating complex multicellular biological systems. Cellular-Potts models (CPMs) are often computationally expensive due to the explicit modeling of interactions among large numbers of individual model agents and diffusive fields described by partial differential equations (PDEs). In this work, we develop a convolutional neural network (CNN) surrogate model using a U-Net architecture that accounts for periodic boundary conditions. We use this model to accelerate the evaluation of a mechanistic CPM previously used to investigatein vitrovasculogenesis. The surrogate model was trained to predict 100 computational steps ahead (Monte-Carlo steps, MCS), accelerating simulation evaluations by a factor of 562 times compared to single-core CPM code execution on CPU. Over short timescales of up to 3 recursive evaluations, or 300 MCS, our model captures the emergent behaviors demonstrated by the original Cellular-Potts model such as vessel sprouting, extension and anastomosis, and contraction of vascular lacunae. This approach demonstrates the potential for deep learning to serve as a step toward efficient surrogate models for CPM simulations, enabling faster evaluation of computationally expensive CPM simulations of biological processes.
more » « less
Full Text Available
Conditioned quantum-assisted deep generative surrogate for particle-calorimeter interactions

https://doi.org/10.1038/s41534-025-01040-x

Toledo-Marín, J Quetzalcóatl; Gonzalez, Sebastian; Jia, Hao; Lu, Ian; Sogutlu, Deniz; Abhishek, Abhishek; Gay, Colin; Paquet, Eric; Melko, Roger G; Fox, Geoffrey C; et al (December 2025, npj Quantum Information)

Abstract Particle collisions at accelerators like the Large Hadron Collider (LHC), recorded by experiments such as ATLAS and CMS, enable precise standard model measurements and searches for new phenomena. Simulating these collisions significantly influences experiment design and analysis but incurs immense computational costs, projected at millions of CPU-years annually during the high luminosity LHC (HL-LHC) phase. Currently, simulating a single event with Geant4 consumes around 1000 CPU seconds, with calorimeter simulations especially demanding. To address this, we propose a conditioned quantum-assisted generative model, integrating a conditioned variational autoencoder (VAE) and a conditioned restricted Boltzmann machine (RBM). Our RBM architecture is tailored for D-Wave’s Pegasus-structured advantage quantum annealer for sampling, leveraging the flux bias for conditioning. This approach combines classical RBMs as universal approximators for discrete distributions with quantum annealing’s speed and scalability. We also introduce an adaptive method for efficiently estimating effective inverse temperature, and validate our framework on Dataset 2 of CaloChallenge.
more » « less
Full Text Available
Exploring the energy landscape of RBMs: reciprocal space insights into bosons, hierarchical learning and symmetry breaking

https://doi.org/10.1088/2632-2153/adf521

Toledo-Marin, J Quetzalcóatl; Maiti, Anindita; Fox, Geoffrey C; Melko, Roger G (August 2025, Machine Learning: Science and Technology)

Abstract Deep generative models have become ubiquitous due to their ability to learn and sample from complex distributions. Despite the proliferation of various frameworks, the relationships among these models remain largely unexplored, a gap that hinders the development of a unified theory of AI learning. In this work, we address two central challenges: clarifying the connections between different deep generative models and deepening our understanding of their learning mechanisms. We focus on Restricted Boltzmann Machines (RBMs), a class of generative models known for their universal approximation capabilities for discrete distributions. By introducing a reciprocal space formulation for RBMs, we reveal a connection between these models, diffusion processes, and systems of coupled bosons. Our analysis shows that at initialization, the RBM operates at a saddle point, where the local curvature is determined by the singular values of the weight matrix, whose distribution follows the Marc̆enko-Pastur law and exhibits rotational symmetry. During training, this rotational symmetry is broken due to hierarchical learning, where different degrees of freedom progressively capture features at multiple levels of abstraction. This leads to a symmetry breaking in the energy landscape, reminiscent of Landau’s theory. This symmetry breaking in the energy landscape is characterized by the singular values and the weight matrix eigenvector matrix. We derive the corresponding free energy in a mean-field approximation. We show that in the limit of infinite size RBM, the reciprocal variables are Gaussian distributed. Our findings indicate that in this regime, there will be some modes for which the diffusion process will not converge to the Boltzmann distribution. To illustrate our results, we trained replicas of RBMs with different hidden layer sizes using the MNIST dataset. Our findings not only bridge the gap between disparate generative frameworks but also shed light on the fundamental processes underpinning learning in deep generative models.
more » « less
Full Text Available
CaloBench: A Benchmark Study of Generative Models for Calorimeter Showers

https://doi.org/10.1007/978-981-96-5032-3_5

Ahmad, Farzana Yasmin; Venkataswamy, Vanamala; Fox, Geoffrey (April 2025, Springer Nature Singapore)
Lin, Weiwei; Jia, Zhen; Hunold, Sascha; Kang, Guoxin (Ed.)
The pursuit of understanding fundamental particle interactions has reached unparalleled precision levels. Particle physics detectors play a crucial role in generating low-level object signatures that encode collision physics. However, simulating these particle collisions is computational and memory intensive which will be exasperated with larger data volumes, more complex detectors, and a higher pileup environment in the High-Luminosity Large Hadron Collider. The introduction of Fast Simulation has been pivotal in overcoming computational and memory bottlenecks. The use of deep-generative models has sparked a surge of interest in surrogate modeling for detector simulations, generating particle showers that closely resemble the observed data. Nonetheless, there is a pressing need for a comprehensive evaluation of the performance of such generative models using a standardized set of metrics. In this study, we conducted a rigorous evaluation of three generative models using standard datasets and a diverse set of metrics derived from physics, computer vision, and statistics. Furthermore, we explored the impact of using full versus mixed precision modes during inference. Our evaluation revealed that the CaloDiffusion and CaloScore generative models demonstrate the most accurate simulation of particle showers, yet there remains substantial room for improvement. Our findings identified where the evaluated models fell short in accurately replicating Geant4 data.
more » « less
Full Text Available
Generative diffusion model surrogates for mechanistic agent-based biological models

https://doi.org/10.1088/2632-2153/ae11f8

Comlekoglu, Tien; Quetzalcoatl_Toledo-Marín, J; DeSimone, Douglas W; Peirce, Shayn M; Fox, Geoffrey; Glazier, James A (October 2025, Machine Learning: Science and Technology)

Abstract Mechanistic, multicellular, agent-based models are commonly used to investigate tissue, organ, and organism-scale biology at single-cell resolution. The Cellular-Potts Model (CPM) is a powerful and popular framework for developing and interrogating these models. CPMs become computationally expensive at large space- and time- scales making application and investigation of developed models difficult. Surrogate models may allow for the accelerated evaluation of CPMs of complex biological systems. However, the stochastic nature of these models means each set of parameters may give rise to different model configurations, complicating surrogate model development. In this work, we leverage denoising diffusion probabilistic models (DDPMs) to train a generative AI surrogate of a CPM used to investigatein vitrovasculogenesis. We describe the use of an image classifier to learn the characteristics that define unique areas of a 2-dimensional parameter space. We then apply this classifier to aid in surrogate model selection and verification. Our CPM model surrogate generates model configurations 20,000 timesteps ahead of a reference configuration and demonstrates approximately a 22x reduction in computational time as compared to native code execution. Our work represents a step towards the implementation of DDPMs to develop digital twins of stochastic biological systems.
more » « less
Full Text Available
Youmu: Efficient Columnar Data Pipeline for LLM Training

Zhong, Tianle; Zhao, Jiechen; Su, Qiang; Fox, Geoffrey (February 2025, https://openreview.net/forum?id=I2LF8QHaua)

Large language models (LLMs) training is extremely data-intensive, often involving over trillion-level tokens. Although LLM datasets are usually ingested and stored in columnar formats, they often need to be converted into another format for training, which incurs significant storage and maintenance costs due to extra data copies. While eliminating the conversion would save tens of terabytes of space in costly high performance storage, this work identifies challenges that drive us to re-think the entire data pipeline. Without conversion, we find that fine-grained random access patterns incur hundreds of times efficiency drops. Specifically, the existing data pipelines have two fundamental drawbacks: (1) They cannot efficiently support directly digesting data in columnar format due to default coarse-grained I/O; (2) Solutions to the first drawback sacrifice memory footprint to cache datasets. In this paper, we present Youmu, a new data pipeline that directly feeds fine-grained columnar data into GPUs, enabling cost-efficient LLM training. Meanwhile, Youmu maintains high training accuracy, whose perplexity outperforms widely adopted local shuffle by reducing 0.3-0.7 for pretraining. Compared to performance-optimal state-of-the-art, distributed memory-based pipelines, Youmu achieves comparable throughput with 80% less memory footprint.
more » « less
Full Text Available
ROSE: RADICAL Orchestrator for Surrogate Exploration

https://doi.org/10.1145/3731599.3767347

Alsaadi, Aymen; Wang, Tianle; Park, Andrew; Bajracharya, Pradeep; Wang, Linwei; Sun, Fanbo; Seal, Sudip; Jadhao, Vikram; Fox, Geoffrey; Jha, Shantenu (November 2025, ACM)

Full Text Available
Deep RC: A Scalable Data Engineering and Deep Learning Pipeline

Sarker, Arup; Alsaadi, Aymen; Halpern, Alexander; Tangella1, Prabhath; Titov, Mikhail; Perera, Niranda; Staylor, Mills; von_Laszewski, Gregor; Jha, Shantenu; Fox, Geoffrey (June 2025, 28th edition of the workshop on Job Scheduling Strategies for Parallel Processing. JSSPP 2025 https://jsspp.org/)

Significant obstacles exist in scientific domains including genetics, climate modeling, and astronomy due to the management, preprocess, and training on complicated data for deep learning. Even while several large-scale solutions offer distributed execution environments, open-source alternatives that integrate scalable runtime tools, deep learning and data frameworks on high-performance computing platforms remain crucial for accessibility and flexibility. In this paper, we introduce Deep Radical-Cylon(RC), a heterogeneous runtime system that combines data engineering, deep learning frameworks, and workflow engines across several HPC environments, including cloud and supercomputing infrastructures. Deep RC supports heterogeneous systems with accelerators, allows the usage of communication libraries like MPI, GLOO and NCCL across multi-node setups, and facilitates parallel and distributed deep learning pipelines by utilizing Radical Pilot as a task execution framework. By attaining an end-to-end pipeline including preprocessing, model training, and postprocessing with 11 neural forecasting models (PyTorch) and hydrology models (TensorFlow) under identical resource conditions, the system reduces 3.28 and 75.9 seconds, respectively. The design of Deep RC guarantees the smooth integration of scalable data frameworks, such as Cylon, with deep learning processes, exhibiting strong performance on cloud platforms and scientific HPC systems. By offering a flexible, high-performance solution for resource-intensive applications, this method closes the gap between data preprocessing, model training, and postprocessing.
more » « less
Full Text Available
Deep RC: A Scalable Data Engineering and Deep Learning Pipeline

Sarker, Arup; Alsaadi, Aymen; Halpern, Alexander; Tangella, Prabhath; Titov, Mikhail; Perera, Niranda; Staylor, Mills; Laszewski, Gregor von; Jha, Shantenu; Fox, Geoffrey (June 2025, Springer. JSSPP 2025: Job Scheduling Strategies for Parallel Processing)

Significant obstacles exist in scientific domains including genetics, climate modeling, and astronomy due to the management, preprocess, and training on complicated data for deep learning. Even while several large-scale solutions offer distributed execution environments, open-source alternatives that integrate scalable runtime tools, deep learning and data frameworks on high-performance computing platforms remain crucial for accessibility and flexibility. In this paper, we introduce Deep Radical-Cylon(RC), a heterogeneous runtime system that combines data engineering, deep learning frameworks, and workflow engines across several HPC environments, including cloud and supercomputing infrastructures. Deep RC supports heterogeneous systems with accelerators, allows the usage of communication libraries like \texttt{MPI}, \texttt{GLOO} and \texttt{NCCL} across multi-node setups, and facilitates parallel and distributed deep learning pipelines by utilizing Radical Pilot as a task execution framework. By attaining an end-to-end pipeline including preprocessing, model training, and postprocessing with 11 neural forecasting models (PyTorch) and hydrology models (TensorFlow) under identical resource conditions, the system reduces 3.28 and 75.9 seconds, respectively. The design of Deep RC guarantees the smooth integration of scalable data frameworks, such as Cylon, with deep learning processes, exhibiting strong performance on cloud platforms and scientific HPC systems. By offering a flexible, high-performance solution for resource-intensive applications, this method closes the gap between data preprocessing, model training, and postprocessing.
more » « less
Full Text Available
Time Series Foundation Models and Deep Learning Architectures for Earthquake Temporal and Spatial Nowcasting

https://doi.org/10.3390/geohazards5040059

Jafari, Alireza; Fox, Geoffrey; Rundle, John B; Donnellan, Andrea; Ludwig, Lisa Grant (November 2024, GeoHazards)

Advancing the capabilities of earthquake nowcasting, the real-time forecasting of seismic activities, remains crucial for reducing casualties. This multifaceted challenge has recently gained attention within the deep learning domain, facilitated by the availability of extensive earthquake datasets. Despite significant advancements, the existing literature on earthquake nowcasting lacks comprehensive evaluations of pre-trained foundation models and modern deep learning architectures; each focuses on a different aspect of data, such as spatial relationships, temporal patterns, and multi-scale dependencies. This paper addresses the mentioned gap by analyzing different architectures and introducing two innovative approaches called Multi Foundation Quake and GNNCoder. We formulate earthquake nowcasting as a time series forecasting problem for the next 14 days within 0.1-degree spatial bins in Southern California. Earthquake time series are generated using the logarithm energy released by quakes, spanning 1986 to 2024. Our comprehensive evaluations demonstrate that our introduced models outperform other custom architectures by effectively capturing temporal-spatial relationships inherent in seismic data. The performance of existing foundation models varies significantly based on the pre-training datasets, emphasizing the need for careful dataset selection. However, we introduce a novel method, Multi Foundation Quake, that achieves the best overall performance by combining a bespoke pattern with Foundation model results handled as auxiliary streams.
more » « less
Full Text Available

« Prev Next »